-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix](parquet-reader) Fix parquet reader crash in set_dict(). #40643
[Fix](parquet-reader) Fix parquet reader crash in set_dict(). #40643
Conversation
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
run buildall |
TeamCity be ut coverage result: |
TPC-H: Total hot run time: 38088 ms
|
TPC-DS: Total hot run time: 197465 ms
|
ClickBench: Total hot run time: 32.07 s
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
…#40643) ## Proposed changes ### Issue ``` *** is nereids: 1 *** tablet id: 4 Abort at 1725864966 (unix time) try "date -d @1725864966" if you are using GNU date *** *** Set a breakpoint in static void __GI_abort() to debug *** PC: @ 0x7f007fb4090a04 *** SIGSEGV (address not mapped to object 0xa0fa868a41d6) received by PID 404737 (TID 274135 OR 0x7ece29df700) from PID 1755584205; stack trace: *** #0 __GI_raise #1 __GI_abort apache#2 sig_handler apache#3 _sigaction apache#4 JVM_handle_linux_signal apache#5 _sigaction apache#6 doris::vectorized::ByteArrayDictDecoder::set_dict(std::unique_ptr<unsigned char[], std::default_delete<unsigned char[]>> &&, int, unsigned long) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/byte_array_dict_decoder.cpp:41 apache#7 doris::vectorized::ColumnChunkReader::_decode_dict_page() at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/vparquet_column_chunk_reader.cpp:258 apache#8 doris::vectorized::ColumnChunkReader::next_page() at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/vparquet_column_chunk_reader.cpp:105 apache#9 doris::vectorized::ParquetColumnReader::_read_column_data(doris::vectorized::Block*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:508 apache#10 doris::vectorized::ScalarColumnReader::_next_value(doris::vectorized::ICollumn*, unsigned long, unsigned long*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:699 apache#11 doris::vectorized::RowGroupReader::_read_column_data(doris::vectorized::Block*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> &, std::vector<doris::vectorized::ColumnSelectVector>*, unsigned long, unsigned long*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:425 apache#12 doris::vectorized::RowGroupReader::get_next_block(doris::vectorized::Block*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:311 apache#13 doris::vectorized::ParquetReader::get_next(doris::vectorized::Block*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/format/parquet/vparquet_reader.cpp:533 apache#14 doris::vectorized::VFileScanner::_get_next_reader_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/scan/vfile_scanner.cpp:368 apache#15 doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/scan/vfile_scanner.cpp:411 apache#16 doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/scan/vscanner.cpp:431 apache#17 doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/scan/vscanner.cpp:96 apache#18 doris::vectorized::ScannerScheduler::submit(doris::vectorized::ScannerContext*, std::shared_ptr<doris::vectorized::ScanTask>) at /mnt/disk1/yy/git/enterprise-core/be/src/vec/exec/scan/scanner_context.cpp:96 apache#19 doris::Thread::supervise_thread(void*) at /mnt/disk1/yy/git/enterprise-core/be/src/util/thread.cpp:499 apache#20 start_thread apache#21 clone in /lib64/libc.so.6 ``` ### Solution It is not known why the parquet dictionary page will be null in this case, causing a crash. This PR adds defensive code to prevent the crash.
## Proposed changes Backport #40643
## Proposed changes Backport #40643
…#41074) ## Proposed changes Backport apache#40643
Proposed changes
Issue
Solution
It is not known why the parquet dictionary page will be null in this case, causing a crash. This PR adds defensive code to prevent the crash.